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Abstract 


The increase in the use of molecular methodologies in systematics has driven the necessity for a comprehensive understanding of 
the limitations of different genetic markers. Not every marker is optimal for all species, which has led to multiple approaches in the 
study of the taxonomy and phylogeny of polyclad flatworms. The present study evaluates base-substitution rates of nuclear ribosom- 
al (JSS rDNA and 28S rDNA), mitochondrial ribosomal (/6S rDNA), and protein-codifying (cytb, cox]) markers for this taxonomic 
group, with the main objective of assessing the robustness of these different markers for phylogenetic studies. Mutation rates and Ti/ 
Tv ratios of the other markers were assessed for the first time. We estimated substitution rates and found cytb to be the most variable, 
while 78S rDNA was the least variable among them. On the other hand, the transition to transversion (Ti/Tv) ratio of the different 
genes revealed differences between the markers, with a higher number of transitions in the nuclear gene 28S and a higher number of 
transversions in the mitochondrial genes. Lastly, we identified that the third codon position of the studied protein-codifying genes 
was highly variable and that this position was saturated in the cox] marker but not in cytb. We conclude that it is important to assess 
the markers employed for different phylogenetic levels for future studies, particularly in the order Polycladida. We encourage the 
use of mitochondrial genes cytb and /6S for phylogenetic studies at suborder, superfamily, and family levels and species delimita- 
tion in polyclads, in addition to the well-known 28S and cox/. 
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Introduction 


Fast and reliable DNA sequencing has become a routine- 
ly used methodology in the description and barcoding of 
new species. In particular, a fragment of the mitochon- 
drial gene cytochrome oxidase c subunit 1 (cox/) has 
become the most frequently used marker for molecular 
identification-based DNA barcoding (Hebert et al. 2003) 
in the majority of species across all taxa, incentivised by 
the Barcode of Life initiative (www.barcodeoflife.org) 


(Ratnasingham and Hebert 2007). Recent studies show, 
however, that genome-wide nucleotide substitution pat- 
terns in coding sequences have species-specific features 
and are variable among evolutionary lineages (Zou and 
Zhang 2021), leading to the question of the ubiquity of 
their use of particular nuclear and mitochondrial genes 
for systematics. 

To address this issue, we investigated transition bias, 
which involves analysing the frequency and nature of 
nucleotide changes between purines and pyrimidines 


Copyright Cuadrado, D. et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, 
distribution, and reproduction in any medium, provided the original author and source are credited. 


864 


across species genomes. This information is crucial for 
understanding the behaviour of different markers com- 
monly employed in phylogenetic studies. Nucleotide 
changes between purines (adenine, A, and guanine, G) 
and pyrimidines (cytosine, C, and thymine, T) are known 
as transitions, whereas changes between a purine and a 
pyrimidine are coined transversions. Due to the dispar- 
ity in the number of types of each possible nucleotide 
change (four types of transitions compared to eight types 
of transversions), the expected number of transitions rel- 
ative to that of transversions (Ti/Tv ratio) would be 0.5 in 
DNA sequence evolution, assuming all types of nucleo- 
tide changes had equal rates of occurrence. However, T1/ 
Tv often exceeds 0.5 or even 1, a phenomenon known as 
transition bias (Nei and Kumar 2000; Yang 2006). Ti/Tv 
bias is commonly considered for estimating nucleotide 
substitution rates, inferring molecular phylogenies, and 
testing for natural selection (Kimura 1980; Tamura and 
Nei 1993; Yang et al. 1998) and has been extensively stud- 
ied in model organisms such as the yeast Saccharomyces 
cerevisiae (Liu and Zhang 2019), the common fruit fly 
Drosophila melanogaster (Schrider et al. 2013), the flow- 
ering plant Arabidopsis thaliana (Ossowski et al. 2010), 
and the nematode Caenorhabditis elegans (Denver et al. 
2009). These studies suggest that transitions are less del- 
eterious and less likely to be purged by natural selection 
than transversions, which could be a reason why transi- 
tions are more commonly found. Furthermore, studies on 
genome error correction show that, due to the structure of 
the genetic code, transversions often lead to non-synon- 
ymous mutations compared to transitions, which usually 
lead to synonymous mutations, thereby potentially affect- 
ing the function and phenotype of the encoded proteins 
(Zhang 2000; Schrider et al. 2013). Therefore, while tran- 
sitions are more frequent than transversions, especially at 
lower taxonomic levels, transversions are considered less 
informative and more difficult to interpret, potentially 
leading to homoplasy effects (evolutionary convergence) 
when comparing distantly related species in parsimo- 
ny-based phylogenies (Broughton et al. 2000). 

Understanding relationships among closely related 
taxa at a species level is essential for conserving biodiver- 
sity, maintaining ecosystem functioning, and understand- 
ing macroevolutionary processes (Oliver et al. 2015). Ex- 
ternal morphological characteristics are historically used 
as diagnostic features for species identification; however, 
contrasting results among morphological and molec- 
ular analyses appear across the entire animal kingdom, 
including nemerteans (Strand and Sundberg 2005), cor- 
als (Forsman et al. 2009), molluscs (Valdés et al. 2017; 
Fernandez-Alvarez et al. 2020), polychaetes (Kupriyano- 
va et al. 2023), fish (Park et al. 2020), insects (Selivon et 
al. 2005; Zhang et al. 2021), and also flatworms (Litvaitis 
et al. 2019). 

Flatworms (order Polycladida) are free-living, carniv- 
orous organisms that occur in a diversity of marine hab- 
itats, with over 800 species described worldwide (Tyler 
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et al. 2006-2024). Exploring the diversity of polyclad 
species 1s critical, considering recent studies indicating 
the importance of the chemical and ecological roles of 
flatworms (Rawlinson and Stella 2012; Gammoudi et al. 
2016; McNab et al. 2021, 2022; Tosetto et al. 2023). Tra- 
ditionally, the taxonomy and phylogenetics of the order 
Polycladida have been based on morphological charac- 
teristics, where differences in tentacles, eyespots, ven- 
tral sucker, and genitalia are used to classify polyclads 
into different genera and families (Faubel 1983, 1984; 
Prudhoe 1985). External morphological characters are, 
however, not always an accurate reflection of the evolu- 
tionary relationships in flatworms. For example, differ- 
ent families of Leptoplanoidea (Acotylea) display very 
similar external morphologies but show different and 
distinguishable features internally and molecularly (Ba- 
hia 2016; Dittmann et al. 2019). Sometimes species with 
few morphological differences show large molecular dis- 
crepancies (Carrera-Parra et al. 2011), and the problem is 
exacerbated when different cryptic polyclad species live 
in sympatry, thereby complicating accurate identification 
and potentially resulting in the amalgamation of multiple 
species into a single one. It is therefore important to iden- 
tify which molecular markers are best suited to resolving 
the evolutionary lineages of flatworms. 

A variety of molecular markers have been used to date 
for the systematic analysis of polyclads. Resolution of 
deep nodes such as suborders (Cotylea and Acotylea) and 
assessment of differences in superfamilies and families 
have initially been based on the 28S rDNA marker (Litvai- 
tis and Newman 2001; Litvaitis et al. 2010; Rawlinson et 
al. 2011; Bahia et al. 2017; Cuadrado et al. 2021). Recent 
studies have, however, noted deficiencies in this marker 
(Dittmann et al. 2019; Litvaitis et al. 2019), because only 
a section of the phylogenetic tree topologies in Cotylea is 
consistently reconstructed. In the case of suborder Acoty- 
lea, despite recent studies (Oya and Kayihara 2020), there 
is aneed for the inclusion of more taxa, additional genetic 
markers, complete markers, and/or searching for other al- 
ternatives to enhance understanding. 

Other polyclad studies have used a range of different 
molecular markers, often employing specific primers 
due to performance issues with universal primers, such 
as coxl, the /6S mitochondrial ribosomal subunit (/6S 
rDNA), the mitochondrial cytochrome 6 (cytb), and the 
nuclear /8S rDNA (Vella et al. 2016; Aguado et al. 2017; 
Tsunashima et al. 2017; Oya and Kajihara 2017, 2020; 
Oya et al. 2019; Tsuyuki et al. 2019, 2022; Cuadrado 
et al. 2021; Rodriguez et al. 2021), as well as complete 
mitochondrial genomes (Aguado et al. 2016; Kenny et 
al. 2019; Yonezawa et al. 2020) for both systematics and 
species delimitation. 

This study evaluates the strength of support provid- 
ed by cox], 16S rRNA, and cytb mitochondrial genes, as 
well as the SS rDNA and 28S rDNA nuclear genes, on 
the phylogeny of the Polycladida through the study of nu- 
cleotide substitutions. 
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Materials and methods 
Sampling sites and processing of materials 


Polyclad flatworms were collected from different sites 
along the coasts of eastern Australia, the Iberian Penin- 
sula, the Canary Islands, Cape Verde, Costa Rica, Cy- 
prus, and Martinique Island (Table 2). This broad dis- 
tribution range included representation of the majority 
of superfamilies across the order Polycladida, including 
Pseudocerotoidea Faubel, 1984; Prosthiostomoidea Ba- 
hia, Padula, & Schrodl, 2017 for the suborder Cotylea; 
Leptoplanoidea Faubel, 1984; Stylochoidea Poche, 1925; 
and Discoceloidea Laidlaw, 1903 for the Acotylea sub- 
order. These species stem from a compilation of avail- 
able biological material from recent studies (Norefia et 
al. 2014, 2015; Marquina et al. 2015a, 2015b; Aguado et 
al. 2017; Pérez-Garcia et al. 2019; Cuadrado et al. 2021; 
Rodriguez et al. 2021; Soutullo et al. 2021), with the aim 
of achieving the greatest possible representativeness and 
sequencing of all available samples. 

Flatworms were collected from under rocks in coastal 
environments, either by hand for intertidal and shallow 
individuals or using SCUBA in deeper areas, and placed 
in separate containers filled with seawater (specific in- 
formation on species is available in the bibliography of 
Table 2). After being transported to a laboratory, a small 
piece of tissue (<1 g) was removed from the body margin 
of each individual using a sterile scalpel blade. The tissue 
of each animal was fixed in absolute ethanol and stored 
for DNA extraction. Each animal was then coaxed onto 
a piece of paper and transferred to a Petri dish contain- 
ing clean, frozen seawater, where it was fixed with either 
10% formalin or Bouin’s liquid. Once the fixation process 
was complete, specimens were stored in 70% ethanol for 
species identification through morphological techniques, 
as per Rodriguez et al. (2021). 


DNA extraction and amplification 


Total genomic DNA was extracted from each tissue sam- 
ple using an Isolate II Genomic DNA Kit (Meridian Bio- 


Table 1. Primers used in this study. 
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science®) following the manufacturer’s protocol. Ampli- 
cons from two nuclear (28S rDNA, 18S rDNA) and three 
mitochondrial (J6S rRNA, cox/, and cytb) target genes 
from each polyclad species were sequenced. All poly- 
merase chain reactions (PCRs) were performed using Taq 
DNA polymerase (Qiagen). The reaction mix included: 
HO — 10.92 ul; 10x buffer — 2 pl; 25 mM MgCl, — 4 ul; 
0.5 mM dNTP — 1 ul; 10 uM primer — 0.25 ul /primer; 
Taq 5 U/ul — 0.08 ul; DNA — 1.5 ul. This gave a reaction 
volume of 20 ul. 

Sequences of approximately 1100 base pairs (bp) 
(28S), 800 pb (J8S), 500 bp (J6S), 1000 bp (cox/), and 
400 bp (cytb) were amplified using the primers listed in 
Table 1. The PCR consisted of an initial denaturation at 
95 °C for 3 min, followed by 40 cycles of denaturation at 
95 °C for 1 min, annealing at 47 °C (cytb), 49 °C (cox), 
59 °C (28S rDNA, 18S rDNA, 16S rRNA) for 30 sec, and 
extension at 72 °C for 1 min, with a final extension of 
10 min at 72 °C. 

The PCR products were observed using TBE gel elec- 
trophoresis in 1.5% agarose gel stained with SYBER Safe 
and visualised under UV light. PCR products were sent to 
Macrogen Korea for clean-up and sequencing. Lastly, the 
obtained forward and reverse sequences were combined 
using the programme Geneious Prime 2020.2.4 (http:// 
www.geneious.com, Kearse et al. 2012) using the align- 
ment-transition/transversion with the consensus sequence 
tool and manually curated. 

The species with the highest possible number of cor- 
rectly sequenced genes was selected to compare the anal- 
yses performed on the different markers. All sequences 
obtained in the present study have been deposited in the 
GenBank database under the accession numbers listed 
in Table 2. 


Comparison of genetic markers 


Alignments of each molecular marker were performed 
with the Clustal W algorithm (Larkin et al. 2007) us- 
ing the programme Geneious Prime 2020.2.4. Ambigu- 
ously aligned and variable regions were recognised and 
excluded using the programme Gblocks version 0.91b 


Gene Primer name Sequence Reference 
18S 18SF2 ACTTTGAACAAATTTGAGTGCTCA Morgan et al. (2003) 
1800mod GATCCTTCCGCAGGTTCACCTACG Raupach et al. (2009) 
28S Platy28S_F AGCCCAGCACCGAATCCT Cuadrado et al. (2021) 
Platy28S_R GCAAACCAAGTAGGGTGTCGC Cuadrado et al. (2021) 
16S PLATYS16SF1 ACAACTGTT TATCAAAAACAT Aguado et al. (2017) 
PLATYS16SR1 ACGCCGGTYTTAACTCAAATCA Aguado et al. (2017) 
cox] HRpra2 AATAAGTATCATGTARACTDATRTCT Tsunashima et al. (2017) 
HRprb2-2 GDGGVTTTGGDAATTGAYTAATACCTT Tsunashima et al. (2017) 
Acotylea_COl_F ACTTTATTCTACTAATCATAAGGATATAGG Oya and Kajihara (201 7) 
Acotylea_COI_R CTTTCCTCTATAAAATGTTACTATTTGAGA Oya and Kajihara (201 7) 
cytb cytb424-444 CAGGAAACAGCTATGACCGGWTAYGTWYTWCCWTGRGGWCARAT Jondelius et al. (2002) 


cytb876-847 


Jondelius et al. (2002) 


TGTAAAACGACGGCCAGTGCRTAWGCRAAWARRAARTAYCAYTCWGG 
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Family Species 18S 288 16S coxl cytb Locality Reference 
Discoceloidea 
Cryptocelidae Cryptocelis sp. MZ292810 MZ292829 MZ292858 MZ273073 PP856191 _~ Galicia, Spain _Norena et al. (2015) 
Discocelidae Discocelis tigrina MZ292799 MK299370 - PP856182 Cuadrado et al. (2021) 
Leptoplanoidea 
Gnesiocerotidae = Echinoplana MW376754 MW377507 MW376599 MW375911 MW392971 New South Rodriguez et al. 
celerrima Wales, Australia (2021) 
Ceratoplana MW376740 MW377493 MW376585 MW375897 MW392973 Victoria, Rodriguez et al. 
falconerae Australia (2021) 
Parabolia megae MW376744 MW377497 MW376589 MW375901 MW392974 New South Rodriguez et al. 
Wales, Australia (2021) 
Leptoplanidae Leptoplana sp. MZ292828 MZ292853 MZ273072 Cape Verde Cuadrado et al. (2021) 
Island 
Parviplana geronimoi MZ292807 MZ292855 Cadiz, Spain Pérez-Garcia et al. 
(2019) 
Notoplanidae Notoplana australis —§ MW376750 MW377503 MW376595 MW375907 MW392986 New South Rodriguez et al. 
Wales, Australia (2021) 
Notoplana felis MW376753 MW377506 MW376598 MW375910 MW392985 Victoria, Rodriguez et al. 
Australia (2021) 
Pleioplanidae Pleioplana atomata = MZ292820 MZ292832 MZ292866 MZ273074 PP856198 _ Asturias, Spain Marquina et al. 
(2015a) 
Pleioplana sp. MZ292808 MZ292840 MZ292856 MZ273079 PP856189 Cadiz, Spain This study 
Pseudostylochidae Jripylocelis typica MW376752 MW377505 MW376597 MW375909 MW392983 New South Rodriguez et al. 
Wales, Australia (2021) 
Stylochoplanidae Stylochoplana clara MW376741 MW377494 MW376586 MW375898 MW392972 Victoria, Rodriguez et al. 
Australia (2021) 
Stylochoidea 
Callioplanidae Callioplana MW376747 MW377500 MW376592 MW375904 MW392984 New South Rodriguez et al. 
marginata Wales, Australia (2021) 
Neostylochus MW376748 MW377501 MW376593 MW375905 New South Rodriguez et al. 
ancorus Wales, Australia (2021) 
Latocestidae Eulatocestus MW376749 MW377502 MW376594 MW375906 New South Rodriguez et al. 
australis Wales, Australia (2021) 
Latocestus plehni MZ292806 MK299376 MZ292852 PP856187 Cape Verde  Cuadrado et al. (2021) 
Island 
Planoceridae Paraplanocera MW376745 MW377498 MW376590 MW375902 MW392981 New South Rodriguez et al. 
marginata Wales, Australia (2021) 
Paraplanocera Sp. MZ292818 MZ292833 MZ292868 MZ273075 PP856200 Cyprus This study 
Planocera edmondsi MW376755 MW377508 MW376600 MW375912 MW392979 Victoria, Rodriguez et al. 
Australia (2021) 
Planocera pellucida MZ292797 MK299355 PP856180 Canary Island, Cuadrado et al. (2021) 
Spain 
Idioplanidae Idioplana MW376746 MW377499 MW376591 MW375903 MW392980 New South Rodriguez et al. 
australiensis Wales, Australia (2021) 
Stylochidae Imogine fafai MZ292817 MZ292835 MZ292865 MF371138 PP856197 Asturias, Spain Aguado et al. (2017) 
Leptostylochus MW376742 MW377495 MW376587 MW375899 MW392982 New South Rodriguez et al. 
victoriensis Wales, Australia (2021) 
Stylochus MZ292800 MZ292841 MZ292846 MF371141 PP856183 _ Galicia, Spain Aguado et al. (2017) 
neapolitanus 
Boninioidea 
Boniniidae Boninia sp. MZ292819 MZ292834 MZ292869 - PP856201 Costa Rica Soutullo et al. (2021) 
Cestoplanidae Cestoplana MW376751 MW377504 MW376596 MW375908 MW392977 New South Rodriguez et al. 
rubrocincta Wales, Australia (2021) 
Pericelidae Pericelis beyerleyana MZ292801 MK299374 MZ292847 PP856184 Martinique Island Cuadrado et al. (2021) 
Pericelis cata MZ292805 MK299352 MZ292851 Cape Verde Cuadrado et al. (2021) 
Island 
Prosthiostomoidea 
Prosthiostomidae Prosthiostomum MW376743 MW377496 MW376588 MW375900 MW392978 New South Rodriguez et al. 
amri Wales, Australia (2021) 
Prosthiostomum MZ292816 MZ292836 MZ292864 MZ273080 PP856196 Almunécar, Pérez-Garcia et al. 
siphunculus Spain (2019) 
Prosthiostomum sp. MZ292795 MZ292826 MZ292842 MZ273071 New South — Rodriguez et al. (2021) 
Wales, Australia 
Enchiridium magec MK299349 MZ292844 PP856179 Canary Island, Cuadrado et al. (2021) 


Pseudocerotoidea 
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Family Species 18S 288 16S cox] cytb Locality Reference 

Euryleptidae Eurylepta cornuta MZ292809 MZ292839 MZ292857 MF371139 PP856190 _ Galicia, Spain Aguado et al. (2017) 
Eurylepta guayota MZ292804 MK299372 MZ292850 - PP856186 Martinique Island Cuadrado et al. (2021) 
Prostheceraeus MZ292811 KY263688 MZ292859 MZ273078 PP856192 = Galicia, Spain Norefa et al. (2014) 
roseus 

Pseudocerotidae = Phrikoceros sp. MZ292796 MZ292827 MZ292843 PP856178 Victoria, Rodriguez et al. (2021) 

Australia 

Pseudoceros MZ292813 MZ292837 MZ292861 PP856194 Lizard Island, Marquina et al. 
depiliktabub Australia (201 5b) 
Pseudoceros MZ292812 MZ292838 MZ292860 MF371147 PP856193  LizardIsland, Aguado et al. (2017) 
stimpsoni Australia 
Pseudoceros MZ292798 MK299381 MZ292845 MZ273076 PP856181 Canary Island, Cuadrado et al. (2021) 
velutinus Spain 
Pseudoceros MK299357 = MZ292854 PP856188 Cape Verde Cuadrado et al. (2021) 
rawlinsonae var. Island 
galaxy 
Pseudobiceros MZ292814 MZ292830 MZ292862 PP856195 _Lizard Island, Marquina et al. 
flowersi Australia (2015b) 
Pseudobiceros MZ292815 MZ292831 MZ292863 Lizard Island, Marquina et al. 
hymanae Australia (201 5b) 
Pseudobiceros MZ292803 MK299378 MZ292849 MZ273077 PP856185 Martinique Island Cuadrado et al. (2021) 
caribbensis 
Thysanozoon MZ292802 MK299383 MZ292848 Martinique Island Cuadrado et al. (2021) 
alagoensis 
Thysanozoon MW376738 MW377491 MW376583 MW392976 Victoria, Rodriguez et al. 
brocchii Australia (2021) 
Yungia aurantiaca MK299386 MZ292867 PP856199 Cadiz, Spain Cuadrado et al. (2021) 

(Castresana 2000) with relaxed parameters (smaller fi- Results 


nal blocks). This resulted in matrices of 521 bp (cox/), 
500 bp (16S rRNA), 393 bp (cytb), 1047 bp (28S rDNA), 
and 859 bp (J8S rDNA). 

A supplementary entropy analysis was also performed 
with IQ-TREE version 1.6.12 (Trifinopoulos et al. 2016) 
to quantify the genetic variability across the length of the 
obtained sequences and assess the grade of conservation 
of each marker (entropy estimation by site). 

The saturation rate of the substitutions of each genetic 
marker was quantified through a transition (T1) and trans- 
version (Tv) saturation graph using PAUP* Version 4.0a 
(Build 166) (Swofford 2003), as well as the distribution 
of variable sites and grade of genetic variability by site 
along the genes’ matrices with an entropy analysis using 
DAMBE 5 (Xia 2013). Interspecific distances for each 
gene were calculated in Mega 6 (Tamura et al. 2013). 

Maximum likelihood (ML) analysis was performed 
with IQ-TREE (Trifinopoulos et al. 2016). The optimal 
substitution model selected by the Bayesian information 
criterion (BIC) proposed by the ModelFinder (Kalyaana- 
moorthy et al. 2017) was GITR+F+I+G4 (6S rDNA, 
coxl), TIM+F+I+G4 (cytb), K2P+I (JSS rDNA), and 
TIM3+F+I+G4 (28S rDNA). The consensus tree of 1000 
standard bootstrap pseudo-replicates was selected and 
edited with iTOL version 4 (Letunic and Bork 2019). A 
node was considered well supported when the bootstrap 
value was 80% or greater. Phylogenies without outgroups 
have been analysed to avoid including inconsistencies 
since it was not possible to obtain a common outgroup 
for the five markers studied. 


Entropy estimation by site 


Entropy analysis revealed genetic variability across the 
length of the obtained sequences and assessed the grade 
of conservation of each marker. The variable positions 
of each studied gene presented a continuous distribution, 
with substitutions unequally distributed in the nuclear 
genes. 18S rDNA presented 58 out of 859 (6.75% of the 
alignment) variable positions (37 parsimonies informa- 
tive, PIs), while 28S rDNA presented 388 out of 1047 
(37.0%) variable positions (306 PIs). /6S rDNA present- 
ed 322 out of 500 (64.4%) variable positions (286 PIs), 
while cytb presented 234 out of 393 (59.54%) variable 
positions (218 PIs), and cox/ presented 293 out of 521 
(56.2%) variable positions (280 PIs) (Table 3, Fig. 1A). 


Table 3. Genetic variability of the analysed sequences. 


Gene Average Min Max Ss Cs Pls 
distance distance distance 
(%) (%) (%) 

18S 1.37 0.00 3.14 859 58(6.75%) 37 
rDNA 
28S 11.21 0.00 sell 1047 388 (37.0%) 306 
rDNA 
16S rRNA = 22.06 0.28 32.77 500 322 (64.4%) 286 
cytb 26.86 0.00 34.40 393 234(59.5%) 218 
cox] 24.86 0.22 34.44 521 293(56.2%) 280 


The minimum distance was calculated as the minimum divergence of all sequenc- 
es; the maximum distance was calculated as the maximum divergence of all se- 
quences; Cs: number of constant sites; S: total number of sites in the matrix; and 
PIs: number of parsimony informative sites. 
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Figure 1. Genomic analysis of the studied genes. A. Entropy estimation by site: The X-axis indicates the number of sequenced po- 
sitions, and the Y-axis indicates the number of variations of each position; B. Estimation of substitution rates in absolute values: The 
X-axis displays the pairwise genetic distance between sample pairs; the Y-axis indicates the number of mutations in absolute values; 
C. Estimation of Ti/Tv in pairwise sequence comparisons: The X-axis shows the pairwise genetic distance between sample pairs, and 
the Y-axis shows the Ti/Tv proportion; D. Estimation of transitions and transversions in pairwise sequence comparisons: The X-axis 
indicates the pairwise genetic distance between sample pairs, and the Y-axis indicates the proportion of transitions and transversions. 


The variable sites of each codon position of the pro- 
tein-codifying genes (cytb and cox/) were also assessed. 
The third codon position presented the highest values 
of interspecific maximum distances in both markers: 
69.41% in cytb and 53.83% in cox/. On the other hand, 
the second codon position had the lowest values of maxi- 
mum distances, with 16.66% in cytb and 13.42% in cox] 
(Table 4). 


Table 4. Genetic variability of the analysed sequences of cytb 
and cox! by codon position. 


Gene Average Min Max 
distance (%) distance (%) distance 
(%) 
cytb first codon position 20.38 0.00 30.58 
cytb second codon position 8.88 0.00 16.66 
cytb third codon position Pl fl 0.00 69.41 
cox1 first codon position L625 0.65 30.06 
cox1 second codon position 5,51 0.00 13.42 
cox1 third codon position 53.83 2.02 53.83 


The minimum distance was calculated as the minimum divergence of all sequenc- 
es; and the maximum distance was calculated as the maximum divergence of 
all sequences. 


Estimate of substitution rate in absolute values 


A total of 1485 (J&S rDNA), 2556 (28S rDNA), 1653 
(16S rDNA), 1176 (cytb), and 703 (cox/) pairwise com- 
parisons from 43 (J8S rDNA), 46 (28S rDNA), 45 (16S 
rDNA), 39 (cytb), and 30 (cox!) species were performed. 
Fig. 1B shows the number of substitutions in absolute 
values (abs) plotted against the pairwise distance between 


zse.pensoft.net 


each sample. All cases presented a linear growth follow- 
ing these equations: 
18S rDNA: 
y =- 16.054x? + 858.37x — 8E-05 
R? = 1.0000 
28S rDNA: 
y =- 760.97x? + 1123.5x — 3.5153 
R? = 0.9743 


16S rDNA: 


y =- 114.54x? + 462.66x — 1.4178 


R? = 0.9766 
cytb: 
y = 30.873x? + 366.5x + 0.3227 
R? = 0.8436 
cox]: 


y =- 43.538x? + 524.81x + 0.1409 


R?=0.9901 
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The coefficient of determination (R’) was close to 1 
in most cases, indicating that all values were close to a 
linear progression except for the cytb mitochondrial gene 
(R? = 0.84). 


Estimates of the transition/transversion ratio 
(Ti/Tv) in pairwise sequence comparisons 


The estimated Ti/Tv ratios plotted against the estimated se- 
quence distances showed the Ti/TV ratio plotted against the 
pairwise distance between each sample (Fig. 1C). Two dif- 
ferentiated regions can be observed: the first was a region 
where the number of transitions and transversions random- 
ly appeared with great variation. Due to the short distances 
between phylogenetic closely related species and the differ- 
ent numbers of transversions and transitions that each pair 
presented, the estimation showed disparate values depend- 
ing on the selected samples, predominating the number of 
transitions, as they are the most probable among closely 
related species. As the distance between species pairs in- 
creased, a second region where the values stabilised around 
1 (where 1 indicates the same number of transversions and 
transitions) appeared. While the value was slightly higher 
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than 1 in most cases (indicating a greater number of transi- 
tions over transversions), starting from pairwise distances 
greater than 20%, the number of transversions increased 
compared to that of transitions in the case of the 16S rDNA 
and cox/ mitochondrial genes. Meanwhile, 28S rDNA pre- 
sented a Ti/Tv ratio between 1 and 2 at longer distances, 
indicating an overall higher number of transitions. 


Estimates of transitions and transversions in 
pairwise sequence comparisons 


Congruent with the results of the Ti/Tv ratio, the ini- 
tial number of transitions was higher than that of trans- 
versions for all gene markers. However, the number of 
transversions was greater at higher distances across all 
markers, as observed in the graphs, except for 28S rDNA, 
where transitions remained higher (Fig. 1D). 

Differences among the three codon positions were ev- 
ident (Fig. 2). The first codon position displayed maxi- 
mum distances of 30.58% for cytb and 30.06% for cox /, 
compared to the maximum distances for the second co- 
don position (16.66% and 13.42%, respectively) and 
those of the third codon position (69.41% and 70.94%). 


vr, 


—— Poly. (Ti3) ———Poly. (Tv3) 


Ti2 T2 
@® ib * vs 


Poy. (Til) ———Pol. (Tvl) 


@ Ti2 @ Tv2 ———Pol.(Ti2) ——Pol. (Tv2) 


Figure 2. Estimation of transitions and transversions for each codon position (from top to bottom: first (1), second (2), and third (3) 
codon positions) in cytb (A) and cox/ (B). The X-axis indicates the pairwise genetic distance between sample pairs, and the Y-axis 


indicates the proportion of transitions and transversions. 
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In both markers, the overall number of transversions was 
higher than that of transitions, apart from the first codon 
position, where the number of transitions was always 
higher than that of transversions. The second codon posi- 
tion displayed a lower mutation rate at shorter distances. 
Lastly, the third codon position presented a higher num- 
ber of overall mutations (both transitions and transver- 
sions), with a higher proportion of transversions in both 
markers; however, a decrease in transition in the cox/ 
gene was observed at pairwise distances higher than 25%. 
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Maximum-likelihood phylogenetic analyses 
The matrices employed to analyse substitution ratios pro- 
vided the following phylogenetic results through a max- 


imum-likelihood analysis performed for each gene (Figs 
3, 4). The results obtained for each marker are: 


ISS rDNA (Fig. 3): This marker showed the separa- 
tion of the two suborders of Polycladida (Cotylea 
and Acotylea) with a bootstrap support (BS) of 97. 
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Figure 3. Maximum-likelihood phylogenetic analysis of nuclear gene markers (/8S and 28S). 
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The superfamily Pseudocerotidae (Cotylea) was high- 
ly supported (BS = 100), as were the genera Pseudoc- 
eros (BS = 80) and Pseudobiceros (BS = 79). The su- 
perfamilies and families of Acotylea appeared without 
strong support (BS < 70). 


28S rDNA (Fig. 3): In this study, the two suborders were 


well supported (BS = 100). Within Cotylea, the four 
analysed superfamilies were well delimited and held: 
Periceloidea and Boninioidea produced two _ inde- 
pendent lineages, and Pseuderotoidea (BS = 94) and 
Prosthiostomoidea (BS = 100) were highly supported. 
Within the last superfamily Pseudocerotidae (includ- 
ing the genera Pseudoceros, Pseudobiceros, Thysa- 
nozoon, and Phrikoceros, BS = 98), an independent 
cluster for the family Euryleptidae was seen. Similarly, 
within the Acotylea suborder, the different superfam- 
ilies Leptoplanoidea and Stylochoidea showed high 
support values (BS = 100 and 88, respectively). Fam- 
ilies such as Styochoplanidae (BS = 100), Leptoplani- 
dae (BS = 99), Latocestidae (BS = 93), and Stylochi- 
dae (BS = 100) were also well supported. 


16S rRNA (Fig. 4): This marker provided robust support 


for the suborders Cotylea and Acotylea and good res- 
olution for the Cotylean superfamilies Periceloidea 
(BS = 100), Prosthiostomoidea (BS = 99), and Pseudoc- 
erotoidea (BS = 98). The two largest Cotylean super- 
families (Prosthiostomoidea and Pseudocerotoidea) 
were grouped in a clade with a bootstrap support of 95. 
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Within Acotylea, the superfamily Stylochoidea was 
not supported (BS = 75), and the superfamily Lepto- 
planoidea did not form a monophyletic assemblage. As 
a result, at the family and genus levels, /6S rRNA did 
not yield clear groups within the leptoplanoids. 


coxl (Fig. 4): This marker is considered the molecular 


“barcode” for the majority of species. In this study, 
support varied depending on the taxonomic level. At 
the suborder level, the support values were lower than 
those from other genes (BS = 77). At the next level, 
the mainly Cotylean and Acotylean superfamilies were 
recovered. The majority of families in both suborders 
did not form monophyletic clusters. 


cytb (Fig. 4): Regarding the last of the studied markers, 


cytb separated the two suborders Cotylea and Acot- 
ylea (BS = 100). It also displayed high support for 
the Cotylean and Acotylean superfamilies. At family 
level, cytb provided good support in both suborders 
(Colylea: Euryleptidae BS = 86 and Pseudocerotidae 
BS = 99; Acotylea: Leptoplanidae BS = 98, Planoce- 
ridae BS = 77, Latocestidae BS = 82, and Stylochidae 
BS = 80), but the majority of Cotylean and Acotylean 
superfamilies were not recovered (Fig. 4). 


All assessed markers placed Cestoplana within or as 


the sister lineage of Cotylea, but none showed an un- 
equivocal phylogenetic or kinship relationship between 
Cestoplana rubrocincta and the other taxa. 
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Figure 4. Maximum-likelihood phylogenetic analysis of mitochondrial gene markers (/6S, cox1, and cytb). 
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Discussion 


This study compares, for the first time, the substitutions 
of mitochondrial and nuclear molecular markers at the 
order level for polyclad flatworms, including represen- 
tatives of all superfamilies within the suborders Cotylea 
and Acotylea. 

Regarding entropy values, it is worth noting the small 
proportion of variable sites in the /SS rDNA nuclear gene 
that denote low phylogenetic values in our analyses of the 
order compared to 28S rDNA, which presented regions 
with clear variability alternating with conserved regions 
(Fig. 1A). This difference is made more apparent when 
compared to the studied mitochondrial markers (/6S 
rDNA, cox1, and cytb), which all presented high variabil- 
ity and substitution rates. In addition, in the three mito- 
chondrial markers, variability was present throughout the 
entire DNA sequence, which is possibly one of the rea- 
sons for the difficulty in creating generic primers for these 
Species, especially in the case of cox/. In most inverte- 
brate taxa, cox] sequencing is possible using universal 
primers such as those designed by Folmer and co-work- 
ers (Folmer et al. 1994) or, more recently, Lobo and col- 
leagues (Lobo et al. 2013). For some taxonomic groups, 
however, these markers do not hybridise, and this appears 
to be the case for most Rhabditophora (Platyhelminthes), 
including those in the order Polycladida (mainly in the 
suborder Cotylea). In this situation, specific primers are 
frequently required (Aguado et al. 2017; Oya et al. 2019; 
Cuadrado et al. 2021). 

The absolute values of substitution rates observed in 
our research reflect a linear increase in variability in all 
cases. A decrease in the absolute mutation rate was only 
observed in cytb, which may have been caused by a cer- 
tain saturation in the signal of sequence substitution due 
to multiple recurrent changes since more than 80% of 
each sequence displayed variability. This saturation trend 
could lead to underestimating the variation in determinate 
terminal taxa. Therefore, tt would be more advisable to 
use this marker for conducting phylogenetic analyses of 
closer groups, such as families or superfamilies. 

The Ti/Tv ratio remained relatively stable for most 
cases, except for 28S rDNA, which presented a higher 
number of transitions at all distances, and cytb, which 
displayed a higher number of transversions. In the case 
of the 28S ribosomal gene, the elevated number of tran- 
sitions 1s most likely due to a lack of conservation of the 
secondary structure of the RNA molecule (Rivas 2021), 
which may be required to preserve its function, with stem 
structures forming at transitions where needed. 

In contrast, the overall increase in transversions in 
mitochondrial genes, particularly in cytb, could be the 
accumulation of substitutions when comparing variable 
sequences very distant from each other. All four types of 
transitions, as opposed to eight types of transversions, 
need to be considered in such situations. Previous stud- 
ies have suggested that, compared with non-synonymous 
transversions, non-synonymous transitions are less dele- 
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terious because they tend not to cause radical changes in 
amino acid physicochemical properties such as charge, 
polarity, and size (e.g., Zhang 2000). However, our re- 
search shows a higher proportion of transitions were ob- 
served for the 28S rDNA nuclear gene in comparison to 
the mitochondrial markers. Meanwhile, /SS rDNA pre- 
sented so few changes that it could hardly indicate a ten- 
dency in the analyses. Our results appear to validate the 
proposition made by Zou and Zhang (2021), who stated 
that the Ti/Tv ratio can be more or less than 1 (1.e., trans- 
versions or transitions being more prevalent) depending 
on the group studied. They attributed variations between 
interchangeable amino acids in protein-coding genes as 
a possible cause for this (e.g., variations in the genetic 
code of different taxa, differences in the functionality of 
generated proteins, etc.). In the case of the phylum Platy- 
helminthes, it is important to point out that the flatworm 
mitochondrial genetic code possesses four variations 
compared to the standard invertebrate mitochondrial 
code. For example, AAA codifies for asparagine (Asn) in 
flatworms, while in the standard mitochondrial code only 
AAT and AAC codify for this amino acid, which leads to 
fixating a transversion in this group. Likewise, the codons 
AGA and AGG translate to serine (Ser) in the flatworm 
mitochondrial genetic code instead of Arg, fixating two 
additional transversions, and UGA codifies for Trp rather 
than being a stop codon (Telford et al. 2000). 

Different patterns of substitutions were also observed 
for the results of the 28S rDNA nuclear gene (Fig. 1B) 
and those present in the mitochondrial genes when com- 
paring transition and transversion rates. The number of 
transitions surpasses the number of transversions in the 
28S rDNA, while the mitochondrial markers show more 
transversions than transitions. The variations in the mi- 
tochondrial genetic code of flatworms mentioned earlier 
could lead to a higher chance of fixating transversions. 
Because of this, we suggest a more exhaustive study on 
this increase in transversions in the mitochondrial DNA 
of polyclads and its implications for Platyhelminthes 
more generally. 

Conspicuous differences were observed when compar- 
ing all codon positions of each of the studied protein-cod- 
ifying genes (cytb and cox/). Saturation of the transver- 
sions was observed in the third codon position of cox/. 
Such saturation has been reported previously for other 
taxonomic groups such as triclads (Alvarez-Presas et al. 
2008), protists (Liu and Zhang 2019), insects (Schrider et 
al. 2013), plants (Ossowski et al. 2010), and nematodes 
(Denver et al. 2009). It is possible that the decrease in the 
signal of the number of transitions at distances higher than 
25% observed in our research could lead to errors in phy- 
logenetic analyses of polyclad flatworms when using the 
cox! genetic marker. A plausible solution to reduce this 
effect during future phylogenetic analyses would be to de- 
lete the third codon position from the alignment. The effec- 
tiveness of this, however, is beyond the scope of this study. 

Based on the results obtained in the ML analysis (Figs 
3, 4), the markers with the best clade support and agree- 
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ment with morphological relationships by histological 
analyses (Faubel 1983, 1984) were 28S rDNA (nuclear) 
and the mitochondrial markers /6S rDNA and cytb. 18S 
rDNA did not offer strong support at any taxonomic level 
studied. Moreover, substantial differences 1n support exist- 
ed within the effective and resolving markers: 28S rDNA 
(nuclear) and 16S rDNA/cytb (mitochondrial). The differ- 
ences observed between mitochondrial and nuclear mark- 
ers, along with their potential incongruences in phyloge- 
netic analyses, have previously been documented in other 
taxa, including Anthozoa, Insecta, and mammals (Zadra et 
al. 2021; Fedorov et al. 2022; Quattrini et al. 2023). 

28S rDNA resolved the majority of nodes well for the 
systematics of suborder, superfamily, family, and, in some 
cases, at the genus and species level in Cotylea. Never- 
theless, the 28S rDNA proved less effective in resolving 
deep nodes within Acotylea, resulting in the formation of 
paraphyletic nodes. 

Within the mitochondrial markers, the best resolution 
level (>75 bootstrap support), compared to current phy- 
logeny (Goodheart et al. 2023), was observed at the ge- 
nus and species level. Both /6S rDNA and cytb strength- 
ened and delimited the genera, resolving specific clusters 
within Cotylea and Acotylea. The specific combinations 
presented in our analyses revealed differences, such as 
the relationship between Cestoplana and Pericelis with- 
in Cotylea or Echinoplana with Leptoplana in Acotylea, 
although these relationships were not recovered by cytb. 
This may be caused by the increased substitution rate 
present among distantly related taxa within the phyloge- 
netic tree (resulting in decreased linear progression of R’) 
and a more complex evolutionary history for these genes 
or taxa. 


Conclusion 


Among the tested markers, cytb presented a higher rate 
of variability and did not show saturation of transitions 
for any codon position. Moreover, this marker present- 
ed the highest range of distances (0% to 34.40%), with 
an average distance of 26.86% compared to that of cox] 
(highest range of distances: 0.22% to 34.44%, average 
distance: 24.86%). 

The use of a common marker for the order Polycla- 
dida would allow direct phylogenetic comparison across 
studies. General primers for these mitochondrial genes 
often fail to hybridise, so we also recommend designing 
de novo cox 1-specific primers for families within the sub- 
order Cotylea and cytb-specific primers for those within 
Acotylea, taking into consideration third base positions. 
The de novo design markers will allow amplification of 
cox! and cytb sequences for certain groups of polyclad 
flatworms that previously could not be analysed due to 
the high number of substitutions across the whole se- 
quence and the lack of conserved regions. 

Thus, for polyclad flatworms, we conclude that for 
future studies at the order level, we encourage the use 
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of mitochondrial genes cytb and 16S rDNA and nucle- 
ar ribosomal genes 28S rDNA. We also encourage the 
use of the cox/ gene with the caution of analysing the 
third codon position to avoid errors in the analyses and 
resolution of deep nodes at a generic or specific level. 
Certainly, the most crucial aspect is to determine the spe- 
cific research inquiry and taxonomic level (such as order, 
family, or genus) and consequently select the appropriate 
genes to better address the study. In the present study, we 
analysed five markers currently used in the resolution of 
phylogenies, kinship analysis, delimitation of species, 
etc. We look forward to future polyclad studies using our 
suggested approach so that we can continue advancing 
the systematics and origin of this taxon on a global scale. 
New sequencing techniques offer the possibility of incor- 
porating additional molecular information if the selected 
genes accurately represent the evolutionary history of the 
species. Concatenating data from different suitable mark- 
ers will further bolster support for the analysed clusters. 

Our case study highlights the need to evaluate how 
well nuclear and mitochondrial genes perform within a 
specific taxonomic group level. We propose that the use 
of transition bias is a useful tool for distinguishing which 
markers may be more effective for any taxon and could 
help streamline success for future systematic studies. It 
would also make cross-study evaluation within a taxo- 
nomic group more effective. A more globally collabora- 
tive approach to molecular systematics would certainly 
facilitate the use of this approach. 
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